Disfluency detection based on prosodic features for university lectures

Authors

  • Henrique Medeiros
  • Helena Moniz
  • Fernando Batista
  • Isabel Trancoso
  • Luís Nunes
Abstract

This paper focuses on identifying disfluent sequences and their distinct structural regions using acoustic and prosodic features. The reported experiments use a corpus of university lectures in European Portuguese of roughly 32 hours, with a relatively high percentage of disfluencies (7.6%). The features automatically extracted from the corpus proved discriminative for the regions that make up a disfluency. Several machine learning methods were applied, and the best results were achieved using Classification and Regression Trees (CART). The most informative features for cross-region identification were word duration ratios, word confidence scores, silent ratios, and pitch and energy slopes. Features such as the number of phones and syllables per word proved more useful for identifying the interregnum, whereas energy slopes were best suited to identifying the interruption point.
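As an illustration of the kind of word-level features the abstract names (duration ratios, silence ratios, pitch and energy slopes), here is a minimal sketch. The abstract does not give the exact feature definitions, so everything below is an assumption: the `Word` record, its field names, the choice of a least-squares slope over per-word samples, and the ratios computed between adjacent words are hypothetical and are not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class Word:
    """Hypothetical word record; field names are illustrative, not from the paper."""
    text: str
    start: float          # word start time in seconds
    end: float            # word end time in seconds
    pitch: List[float]    # f0 samples within the word (assumed to be available)
    energy: List[float]   # energy samples within the word (assumed to be available)

    @property
    def duration(self) -> float:
        return self.end - self.start


def slope(values: Sequence[float]) -> float:
    """Least-squares slope of the values against their sample index.

    Returns 0.0 when there are fewer than two samples.
    """
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def pair_features(prev: Word, cur: Word) -> Dict[str, float]:
    """Features for a pair of adjacent words (assumed definitions)."""
    pause = max(0.0, cur.start - prev.end)  # silence between the two words
    return {
        "duration_ratio": cur.duration / prev.duration if prev.duration > 0 else 0.0,
        "silence_ratio": pause / cur.duration if cur.duration > 0 else 0.0,
        "pitch_slope_prev": slope(prev.pitch),
        "pitch_slope_cur": slope(cur.pitch),
        "energy_slope_prev": slope(prev.energy),
        "energy_slope_cur": slope(cur.energy),
    }
```

Feature dictionaries of this shape could then be fed to any decision-tree learner; the abstract reports that CART gave the best results but does not say which implementation or settings were used.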

Similar resources

Comparing Different Machine Learning Approaches for Disfluency Structure Detection in a Corpus of University Lectures

This paper presents a number of experiments focusing on assessing the performance of different machine learning methods on the identification of disfluencies and their distinct structural regions over speech data. Several machine learning methods have been applied, namely Naive Bayes, Logistic Regression, Classification and Regression Trees (CARTs), J48 and Multilayer Perceptron. Our experiment...

Prosodic context-based analysis of disfluencies

This work explores prosodic cues of disfluencies in a corpus of university lectures. Results show three significant (p < 0.001) trends: pitch and energy slopes are significantly different between the disfluency and the onset of fluency; those features are also relevant to disfluency type differentiation; and they do not seem to be a speaker effect. The best combination of linguistic features one...

Spontaneous Mandarin Speech Recognition with Disfluencies Detected by Latent Prosodic Modeling (LPM)

In this paper, a new approach for improved spontaneous Mandarin speech recognition using Latent Prosodic Modeling (LPM) for disfluency interruption point (IP) detection is presented. The basic idea is to detect the disfluency interruption points (IPs) prior to recognition, and then to incorporate this information into the recognition process via second-pass rescoring. For accurate dete...

Analysis of disfluencies in a corpus of university lectures

This paper analyzes the prosodic properties of disfluencies and of their contexts in a corpus of university lectures. Results show that there is a general tendency to repair fluency by means of prosodic contrast marking strategies (pitch and energy increase), regardless of the specific disfluency type, but still there are degrees in the contrast made by certain types. As for tempo patterns, the...

Publication date: 2013